In this work, we took on the task of both predicting and quickly dissecting the NYC 311 call volume data set. There are several viable ways to make inroads in this work, including triaging by agency and complaint type. Ultimately, we were interested in teasing out macro behaviors and began our exploration without any strong prior bias. Certain salient aspects were easily distilled: annual seasonality (which differs by agency), the importance of the day of the week, and intra-day patterns. Our team successfully applied random forest regression to predict next-day call volume by complaint type; the predictions were largely influenced by the day of the week, the weather, and trailing call volume. This process revealed several clusters of agency-complaint pairs with high correlation. Departing from the need to distill macro behaviors, we also explored the behavior of one rather unique complaint: graffiti. This complaint hints at certain differences in societal and generational norms.
We plot a time series of complaints by agency and by year. The plot reveals a seasonal pattern that behaves similarly in each year: the number of complaints increases in winter and decreases in summer.
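The aggregation behind a plot like this can be sketched in a few lines. This is a minimal illustration rather than the code used for the analysis: the sample records are made up, and the function names `monthly_counts` and `yearly_series` are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Made-up sample records (creation timestamp, agency); not real 311 data.
SAMPLE_RECORDS = [
    ("2014-01-15 08:30:00", "HPD"),
    ("2014-01-20 22:10:00", "HPD"),
    ("2014-07-04 13:00:00", "HPD"),
    ("2015-01-03 09:45:00", "HPD"),
    ("2015-01-09 17:20:00", "DOT"),
    ("2015-07-11 11:05:00", "DOT"),
]


def monthly_counts(records):
    """Count complaints per (agency, year, month)."""
    counts = Counter()
    for created, agency in records:
        ts = datetime.strptime(created, "%Y-%m-%d %H:%M:%S")
        counts[(agency, ts.year, ts.month)] += 1
    return counts


def yearly_series(counts, agency, year):
    """Return 12 monthly totals (January to December) for one agency and year."""
    return [counts.get((agency, year, month), 0) for month in range(1, 13)]


counts = monthly_counts(SAMPLE_RECORDS)
hpd_2014 = yearly_series(counts, "HPD", 2014)
print(hpd_2014)
```

Plotting one such series per year on a shared January-to-December axis makes it easy to compare the seasonal shape across years for a given agency.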
Additionally, we explored the distribution of complaints per day over the entire time frame to discern any obvious trends that might hint at concentration on particular days. The calendar chart helps us visualize the distribution not just per day but also per month in every year, which makes seasonal patterns easier to detect.
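A calendar chart needs one count per calendar day, and the same counts can be rolled up by weekday to check for concentration on particular days. The sketch below assumes complaint dates are already parsed; the sample dates and the helper names `daily_counts` and `weekday_totals` are illustrative assumptions.

```python
from collections import Counter
from datetime import date

# Made-up complaint dates; not real 311 data.
SAMPLE_DATES = [
    date(2015, 5, 4),   # Monday
    date(2015, 5, 4),
    date(2015, 5, 5),   # Tuesday
    date(2015, 5, 9),   # Saturday
    date(2015, 5, 11),  # Monday
]

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def daily_counts(dates):
    """Count complaints per calendar day (the input a calendar chart needs)."""
    return Counter(dates)


def weekday_totals(per_day):
    """Roll daily counts up to totals per day of the week."""
    totals = {name: 0 for name in WEEKDAYS}
    for day, n in per_day.items():
        # date.weekday() returns 0 for Monday through 6 for Sunday.
        totals[WEEKDAYS[day.weekday()]] += n
    return totals


per_day = daily_counts(SAMPLE_DATES)
by_weekday = weekday_totals(per_day)
print(by_weekday)
```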
Our second question is whether the number of complaints can be explained by geographical information, particularly where the complaints occur. The following map shows May 2015; we can recognize a large number of complaints in downtown Manhattan and uptown.
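One simple way to quantify what such a map shows is to bin complaint coordinates into a regular grid and count complaints per cell. This is a sketch under stated assumptions, not the method behind the map: the coordinates are made up, and the 0.01-degree cell size and the helper name `grid_cell` are arbitrary choices for illustration.

```python
import math
from collections import Counter

# Made-up (latitude, longitude) pairs; not real 311 data.
SAMPLE_POINTS = [
    (40.7075, -74.0113),
    (40.7081, -74.0105),
    (40.7079, -74.0109),
    (40.8116, -73.9465),
    (40.6782, -73.9442),
]


def grid_cell(lat, lon, cell_size=0.01):
    """Map a coordinate to the integer index of its grid cell."""
    return (math.floor(lat / cell_size), math.floor(lon / cell_size))


cell_counts = Counter(grid_cell(lat, lon) for lat, lon in SAMPLE_POINTS)
busiest_cell, busiest_count = cell_counts.most_common(1)[0]
print(busiest_cell, busiest_count)
```

Cells with the highest counts correspond to the dense areas visible on the map; a smaller cell size gives finer hotspots at the cost of noisier counts.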
We now know that time and location are important factors in the number of requests. However, additional variables and interactions, such as the borough and the type of request, can also help explain the behavior of complaints.
Using GoogleViz, we can combine our plots to create an interactive dashboard. Our first conclusion is that the top ten complaint types account for more than fifty percent of all 311 complaints. In addition, we use a treemap to represent the number of complaints by borough. It shows that complaints in Queens are mostly related to street condition, whereas complaints in the other boroughs mostly concern heating. You can click on a subject of interest to drill into it, and return by right-clicking the gray header.
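Both observations reduce to simple counting: the share of all complaints covered by the top N complaint types, and the leading complaint type within each borough. The following sketch uses made-up records and the hypothetical helpers `top_n_share` and `leading_by_borough`; on the real data, N would be ten.

```python
from collections import Counter, defaultdict

# Made-up (borough, complaint type) records; not real 311 data.
SAMPLE = (
    [("QUEENS", "Street Condition")] * 3
    + [("QUEENS", "Noise")]
    + [("BRONX", "Heating")] * 3
    + [("BROOKLYN", "Heating")] * 2
    + [("BROOKLYN", "Noise")]
)


def top_n_share(records, n):
    """Return the n most common complaint types and their share of all records."""
    counts = Counter(complaint for _, complaint in records)
    top = counts.most_common(n)
    return top, sum(c for _, c in top) / sum(counts.values())


def leading_by_borough(records):
    """Return the most frequent complaint type in each borough."""
    per_borough = defaultdict(Counter)
    for borough, complaint in records:
        per_borough[borough][complaint] += 1
    return {b: c.most_common(1)[0][0] for b, c in per_borough.items()}


top, share = top_n_share(SAMPLE, 2)
leading = leading_by_borough(SAMPLE)
print(top, share)
print(leading)
```

The per-borough counts built inside `leading_by_borough` have the same two-level shape (borough, then complaint type) that a treemap displays.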